23 research outputs found

Leveraging human-computer interaction and crowdsourcing for scholarly knowledge graph creation

Author: Oelen Allard
Publication venue: Hannover : Institutionelles Repositorium der Leibniz Universität Hannover
Publication date: 23/11/2022

The number of scholarly publications continues to grow each year, as does the number of journals and active researchers. Therefore, methods and tools to organize scholarly knowledge are becoming increasingly important. Without such tools, it becomes increasingly difficult to conduct research in an efficient and effective manner. One of the fundamental issues scholarly communication is facing relates to the format in which the knowledge is shared. Scholarly communication relies primarily on narrative document-based formats that are specifically designed for human consumption. Machines cannot easily access and interpret such knowledge, leaving machines unable to provide powerful tools to organize scholarly knowledge effectively. In this thesis, we propose to leverage knowledge graphs to represent, curate, and use scholarly knowledge. The systematic knowledge representation leads to machine-actionable knowledge, which enables machines to process scholarly knowledge with minimal human intervention. To generate and curate the knowledge graph, we propose a crowdsourcing approach assisted by machine learning, in particular Natural Language Processing (NLP). Currently, NLP techniques are not able to satisfactorily extract high-quality scholarly knowledge in an autonomous manner. With our proposed approach, we intertwine human and machine intelligence, thus exploiting the strengths of both approaches. First, we discuss structured scholarly knowledge, where we present the Open Research Knowledge Graph (ORKG). Specifically, we focus on the design and development of the ORKG user interface (i.e., the frontend). One of the key challenges is to provide an interface that is powerful enough to create rich knowledge descriptions yet intuitive enough for researchers without a technical background to create such descriptions. The ORKG serves as the technical foundation for the rest of the work.
Second, we focus on comparable scholarly knowledge, where we introduce the concept of ORKG comparisons. ORKG comparisons provide machine-actionable overviews of related literature in a tabular form. Also, we present a methodology to leverage existing literature reviews to populate ORKG comparisons via a human-in-the-loop approach. Additionally, we show how ORKG comparisons can be used to form ORKG SmartReviews. The SmartReviews provide dynamic literature reviews in the form of living documents. They are an attempt to address the main weaknesses of the current literature review practice and to outline what the future of review publishing could look like. Third, we focus on designing suitable tasks to generate scholarly knowledge in a crowdsourced setting. We present an intelligent user interface that enables researchers to annotate key sentences in scholarly publications with a set of discourse classes. During this process, researchers are assisted by suggestions coming from NLP tools. In addition, we present an approach to validate NLP-generated statements using microtasks in a crowdsourced setting. With this approach, we lower the barrier to entering data in the ORKG and transform content consumers into content creators. With the work presented, we strive to transform scholarly communication to improve machine-actionability of scholarly knowledge. The approaches and tools are deployed in a production environment. As a result, the majority of the presented approaches and tools are currently in active use by various research communities and already have an impact on scholarly communication.
EC/ERC/819536/E

Institutionelles Repositorium der Leibniz Universität Hannover

Crowdsourcing Scholarly Discourse Annotations

Author: Auer Sören; Oelen Allard; Stocker Markus
Publication venue: New York, NY : ACM
Publication date: 01/01/2021

The number of scholarly publications grows steadily every year, and it becomes harder to find, assess, and compare scholarly knowledge effectively. Scholarly knowledge graphs have the potential to address these challenges. However, creating such graphs remains a complex task. We propose a method to crowdsource structured scholarly knowledge from paper authors with a web-based user interface supported by artificial intelligence. The interface enables authors to select key sentences for annotation. It integrates multiple machine learning algorithms to assist authors during the annotation, including class recommendation and key sentence highlighting. We envision that the interface is integrated in paper submission processes for which we define three main task requirements: The task has to be . We evaluated the interface with a user study in which participants were assigned the task of annotating one of their own articles. With the resulting data, we determined whether the participants were able to perform the task successfully. Furthermore, we evaluated the interface's usability and the participants' attitudes towards the interface with a survey. The results suggest that sentence annotation is a feasible task for researchers and that they do not object to annotating their articles during the submission process.

Repositorium für Naturwissenschaften und Technik

Creating a Scholarly Knowledge Graph from Survey Article Tables

Author: Auer Sören; Oelen Allard; Stocker Markus
Publication venue: Cham : Springer
Publication date: 01/01/2020

Due to its lack of structure, scholarly knowledge remains hard for machines to access. Scholarly knowledge graphs have been proposed as a solution. Creating such a knowledge graph requires manual effort and domain experts, and is therefore time-consuming and cumbersome. In this work, we present a human-in-the-loop methodology used to build a scholarly knowledge graph leveraging literature survey articles. Survey articles often contain manually curated and high-quality tabular information that summarizes findings published in the scientific literature. Consequently, survey articles are an excellent resource for generating a scholarly knowledge graph. The presented methodology consists of five steps, in which tables and references are extracted from PDF articles, and tables are formatted and finally ingested into the knowledge graph. To evaluate the methodology, 92 survey articles, containing 160 survey tables, have been imported into the graph. In total, 2626 papers have been added to the knowledge graph using the presented methodology. The results demonstrate the feasibility of our approach, but also indicate that manual effort is required and thus underscore the important role of human experts.

Repositorium für Naturwissenschaften und Technik

Creating and validating a scholarly knowledge graph using natural language processing and microtask crowdsourcing

Author: Auer Sören; Oelen Allard; Stocker Markus
Publication venue: Berlin ; Heidelberg ; New York : Springer
Publication date: 01/01/2023

Due to the growing number of scholarly publications, finding relevant articles becomes increasingly difficult. Scholarly knowledge graphs can be used to organize the scholarly knowledge presented within those publications and represent it in machine-readable formats. Natural language processing (NLP) provides scalable methods to automatically extract knowledge from articles and populate scholarly knowledge graphs. However, NLP extraction is generally not sufficiently accurate and thus fails to generate high-granularity, high-quality data. In this work, we present TinyGenius, a methodology to validate NLP-extracted scholarly knowledge statements using microtasks performed with crowdsourcing. TinyGenius is employed to populate a paper-centric knowledge graph, using five distinct NLP methods. We extend our previous work on the TinyGenius methodology in various ways. Specifically, we discuss the NLP tasks in more detail and include an explanation of the data model. Moreover, we present a user evaluation where participants validate the generated NLP statements. The results indicate that employing microtasks for statement validation is a promising approach despite the varying participant agreement for different microtasks.
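
The idea of validating NLP-extracted statements through crowd microtasks can be illustrated with a minimal sketch. The abstract does not say how votes are aggregated, so the simple majority vote below is an assumption, as are the function name `validate_statements`, the `min_votes` parameter, the answer labels, and the example data. It is not the TinyGenius implementation.

```python
from collections import Counter


def validate_statements(votes, min_votes=3):
    """Aggregate crowd answers per statement with a simple majority vote.

    `votes` maps a (hypothetical) statement id to a list of answers,
    each either "correct" or "incorrect". Statements with fewer than
    `min_votes` answers, or with a tie, are left undecided.
    """
    verdicts = {}
    for statement_id, answers in votes.items():
        if len(answers) < min_votes:
            verdicts[statement_id] = "undecided"
            continue
        counts = Counter(answers)
        correct, incorrect = counts["correct"], counts["incorrect"]
        if correct > incorrect:
            verdicts[statement_id] = "accepted"
        elif incorrect > correct:
            verdicts[statement_id] = "rejected"
        else:
            verdicts[statement_id] = "undecided"
    return verdicts


# Made-up votes for three extracted statements (illustrative only).
example_votes = {
    "s1": ["correct", "correct", "incorrect"],
    "s2": ["incorrect", "incorrect", "correct"],
    "s3": ["correct"],
}
print(validate_statements(example_votes))
```

Leaving low-vote and tied statements undecided reflects the abstract's observation that participant agreement varies between microtasks; a real system might weight votes or route such statements to experts instead.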

Institutionelles Repositorium der Leibniz Universität Hannover

KGMM -- A Maturity Model for Scholarly Knowledge Graphs based on Intertwined Human-Machine Collaboration

Author: Auer Sören; Hussein Hassan; Karras Oliver; Oelen Allard
Publication date: 22/11/2022

Knowledge Graphs (KG) have gained increasing importance in science, business, and society in recent years. However, most knowledge graphs were either extracted or compiled from existing sources. There are only relatively few examples where knowledge graphs were genuinely created by an intertwined human-machine collaboration. Also, since the quality of data and knowledge graphs is of paramount importance, a number of data quality assessment models have been proposed. However, they do not take the specific aspects of intertwined human-machine curated knowledge graphs into account. In this work, we propose a graded maturity model for scholarly knowledge graphs (KGMM), which specifically focuses on aspects related to the joint, evolutionary curation of knowledge graphs for digital libraries. Our model comprises 5 maturity stages with 20 quality measures. We demonstrate the implementation of our model in a large-scale scholarly knowledge graph curation effort.
Comment: Accepted as a full paper at ICADL 2022 (International Conference on Asian Digital Libraries)

arXiv.org e-Print Archive

Measuring surface water quality using a low-cost sensor kit within the context of rural Africa

Author: De Boer Victor; Oelen Allard; Van Aart Chris
Publication venue: CEUR Workshop Proceedings
Publication date: 01/01/2018

Water quality is monitored for a variety of reasons, including determining whether water is suitable for drinking or agricultural purposes. In rural areas of Africa, the traditional way of measuring water quality can be costly and time-consuming. In this research, we developed a low-cost water quality measuring device that is designed to operate in the context of rural Africa. First, we selected appropriate water quality sensors. Second, we developed a water quality monitoring device that takes the contextual requirements and constraints of rural Africa into account. Finally, the device was evaluated and tested using water samples collected in rural Africa.

VU Research Portal

Comparing research contributions in a scholarly knowledge graph

Author: Auer Sören; Farfar Kheir Eddine; Jaradeh Mohamad Yaser; Oelen Allard; Stocker Markus
Publication venue: Aachen : RWTH Aachen
Publication date: 01/01/2019

Conducting a scientific literature review is a time-consuming activity. This holds for both finding and comparing the related literature. In this paper, we present a workflow and system designed to, among other things, compare research contributions in a scientific knowledge graph. In order to compare contributions, multiple tasks are performed, including finding similar contributions, mapping properties, and visualizing the comparison. The presented workflow is implemented in the Open Research Knowledge Graph (ORKG), which enables researchers to find and compare related literature. A preliminary evaluation has been conducted with researchers. Results show that researchers are satisfied with the usability of the user interface, but more importantly, they acknowledge the need for and usefulness of contribution comparisons.
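
The property-mapping and tabular-comparison steps named in this abstract can be sketched roughly as follows. This is an illustrative assumption and not the ORKG implementation: the function `build_comparison`, the `aliases` mapping, the `"-"` placeholder for missing values, and all example data are hypothetical.

```python
def build_comparison(contributions, aliases=None):
    """Build a simple comparison table from contribution descriptions.

    `contributions` maps a contribution label to a dict of
    property -> value. `aliases` maps alternative property names onto
    one shared name, so that equivalent properties land in the same
    column. Returns the sorted column names and one row per contribution.
    """
    aliases = aliases or {}
    # Step 1: map properties onto shared names.
    normalized = {
        label: {aliases.get(prop, prop): value for prop, value in props.items()}
        for label, props in contributions.items()
    }
    # Step 2: the table columns are the union of all properties.
    properties = sorted({prop for props in normalized.values() for prop in props})
    # Step 3: one row per contribution, "-" where a value is missing.
    rows = {
        label: [props.get(prop, "-") for prop in properties]
        for label, props in normalized.items()
    }
    return properties, rows


# Made-up contributions that describe the same aspect under different names.
contributions = {
    "paper A": {"method": "rule-based", "dataset": "D1"},
    "paper B": {"approach": "neural", "score": 0.7},
}
properties, rows = build_comparison(contributions, aliases={"approach": "method"})
print(properties)
for label, row in rows.items():
    print(label, row)
```

A fixed alias table keeps the sketch short; finding similar contributions and matching properties automatically, which the abstract also mentions, would need a similarity measure that is out of scope here.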

Institutionelles Repositorium der Leibniz Universität Hannover

Improving Access to Scientific Literature with Knowledge Graphs

Author: Auer Sören; D’Souza Jennifer; Farfar Kheir Eddine; Haris Muhammad; Jaradeh Mohamad Yaser; Oelen Allard; Prinz Manuel; Stocker Markus; Vogt Lars; Wiens Vitalis
Publication venue: Humboldt-Universität zu Berlin
Publication date: 15/10/2020

The transfer of knowledge has not changed fundamentally for many hundreds of years: it is usually document-based, formerly printed on paper as a classic essay and nowadays distributed as PDF. With around 2.5 million new research contributions every year, researchers drown in a flood of pseudo-digitized PDF publications. As a result, research is seriously weakened. In this article, we argue for representing scholarly contributions in a structured and semantic way as a knowledge graph. The advantage is that information represented in a knowledge graph is readable by machines and humans. As an example, we give an overview of the Open Research Knowledge Graph (ORKG), a service implementing this approach. For creating the knowledge graph representation, we rely on a mixture of manual (crowd/expert sourcing) and (semi-)automated techniques. Only with such a combination of human and machine intelligence can we achieve the required quality of the representation to allow for novel exploration and assistance services for researchers. As a result, a scholarly knowledge graph such as the ORKG can be used to give a condensed overview of the state-of-the-art addressing a particular research quest, for example as a tabular comparison of contributions according to various characteristics of the approaches. Further possible intuitive access interfaces to such scholarly knowledge graphs include domain-specific (chart) visualizations or answering of natural language questions.
Peer Reviewed

Dokumenten-Publikationsserver der Humboldt-Universität zu Berlin